Optical Scanning System for Surface Inspection

ABSTRACT

In an optical scanning system for detecting particles and pattern defects on a sample surface, a light beam is focused to an illuminated spot on the surface and the spot is scanned across the surface along a scan line. A detector is positioned adjacent to the surface to collect scattered light from the spot where the detector includes a one- or two-dimensional array of sensors. Light scattered from the illuminated spot at each of a plurality of positions along the scan line is focused onto a corresponding sensor in the array. A plurality of detectors symmetrically placed with respect to the illuminating beam detect laterally and forward scattered light from the spot. The spot is scanned over arrays of scan line segments shorter than the dimensions of the surface. A bright field channel enables the adjustment of the height of the sample surface to correct for errors caused by height variations of the surface. Different defect maps provided by the output of the detectors can be compared to identify and classify the defects. The imaging function of the array of sensors combines the advantages of a scanning system and an imaging system while improving signal/background ratio of the system.

CROSS REFERENCE TO RELATED APPLICATIONS

This is a continuation of application Ser. No. 10/412,458, filed Apr. 10, 2003; which application is a continuation of application Ser. No. 09/571,303, filed May 8, 2000, now abandoned; which application is a continuation of application Ser. No. 08/868,292, filed Jun. 3, 1997, now U.S. Pat. No. 6,081,325; which application claims the benefit of the filing date of Application No. 60/018,973, filed Jun. 4, 1996. These applications are hereby incorporated by reference as if fully set forth herein.

BACKGROUND OF THE INVENTION

The present invention pertains to the field of optical surface inspection. Specifically, the present invention pertains to illumination and light collection optics for inspecting semiconductor wafers and the like.

Monitoring anomalies, such as pattern defects and particulate contamination, during the manufacture of semiconductor wafers is an important factor in increasing production yields. Numerous types of defects and contamination, especially particles, can occur on a wafer's surface. Determining the presence, location and type of an anomaly on the wafer surface can aid in both locating process steps at which the anomaly occurred and determining whether a wafer should be discarded.

Originally, anomalies were monitored manually by visual inspection of wafer surfaces for the presence of particulate matter. These anomalies, usually dust or microscopic silicon particles, caused many of the wafer pattern defects. However, manual inspection proved time-consuming and unreliable due to operator errors or an operator's inability to observe certain defects. The ever increasing size of the wafer surface, along with the decreasing dimensions of the components thereon, resulted in a sharp increase in the number of components on the wafer's surface. The need for automation became manifest.

To decrease the time required to inspect wafer surfaces, many automatic inspection systems were introduced. A substantial majority of these automatic inspection systems detect anomalies based on the scattering of light. For example, see U.S. Pat. No. 4,601,576 to L. Galbraith, assigned to the assignee of the present invention. These systems include two major components: illumination optics and collection-detection optics. Illumination optics generally consists of scanning a wafer surface with a coherent source of light, e.g., a laser. Anomalies present on the wafer's surface scatter incident light. The collection optics detect the scattered light with reference to the known beam position. The scattered light is then converted to electrical signals which can be measured, counted and displayed as bright spots on an oscilloscope or other monitor.

The illumination optics plays a major role in establishing the detection sensitivity of the inspection system. The sensitivity is dependent upon the size of the spot scanned on the wafer and the illumination angle. The smaller the spot size, the more sensitive the system is to detecting anomalies. However, decreasing the spot size increases the time required to scan the wafer surface and therefore reduces throughput.

The sensitivity of both the illumination and collection-detection optics is dependent upon the texture of the surface of the wafer illuminated. If the surface illuminated is patterned, this reduces the sensitivity of the system because such areas produce scatter which makes it difficult to determine the presence of an anomaly. To abrogate scatter due to patterned features, the angle of incidence of the spot on the surface is increased, with respect to the normal to the surface. However, too great an angle, i.e., a grazing angle with respect to the surface, will also reduce the sensitivity of the system. Moreover, increasing the angle of incidence increases the effective size of the spot, thereby reducing the sensitivity of the system. Thus, a trade-off exists between sensitivity and inspection rate of the system. The sensitivity of the collection-detection optics is generally a function of the detector's azimuthal position with respect to the scanning beam and elevation.

Accordingly, many illumination and collection-detection techniques have been proposed that take advantage of the aforementioned concepts. In addition, efforts have been made to provide for constant scanning of the wafer's surface to further increase the speed of the inspection. In U.S. Pat. No. 5,317,380, Allemand discloses a beam of laser light brought to focus as an arcuate scan line on a surface, at a grazing angle of incidence. A pair of light detectors are provided to collect light which is scattered away from the beam in a forward direction so that the angle of collection is constant over the entire scan line.

U.S. Pat. No. 4,912,487 to Porter et al. discloses a laser pattern writing and inspection system that illuminates a target surface with an argon ion laser beam. An acousto-optical deflector is driven with a chirp signal and placed in the path of the beam to cause it to sweep out raster scan lines. The target is placed on a stage capable of bi-directional movement. The beam has an angle of incidence normal to the target and the stage moves so that it is scanned along adjacent contiguous strips of equal width.

U.S. Pat. No. 4,889,998 to Hayano et al. discloses an apparatus and method for detecting foreign particles on a pellicle using a beam of light that is scanned across the pellicle with light detected by a plurality of detectors grouped in pairs. Two pairs of detectors are positioned to collect rearwardly scattered light. The difference in intensity of scattered light detected by each detector is monitored, whereby the position of the particle on the pellicle is determined by analyzing the intensity variations.

In U.S. Pat. No. 4,898,471 to Stonestrom et al., an apparatus and method for detecting particles on a patterned surface are disclosed wherein a single light beam is scanned, at a grazing angle of incidence, across the surface. The surface contains a plurality of identical dies with streets between them. With the beam scanning parallel to the streets, a single channel collection system detects scattered light from an azimuthal angle that maximizes particle signals while reducing pattern signals. A processor constructs templates from the detected light which corresponds to individual dies and then compares the templates to identify particles on the dies.

In U.S. Pat. No. 4,617,427 to Koizumi et al., a wafer is mounted on a feed stage connected to a rotary drive which provides a constant speed helical scan of a wafer surface. An S-polarized laser beam is scanned thereon at varying angles of incidence. The angle of incidence is dependent upon whether the wafer is smooth or patterned. A single detector is positioned perpendicular to the wafer's surface for collecting scattered light and includes a variable polarization filter that attenuates scattered light in an S polarization state, when the surface is patterned, and does not attenuate light if the wafer is smooth.

In U.S. Pat. No. 4,441,124 to Heebner, a laser is scanned over the surface of a wafer at an angle normal thereto. The laser beam is scanned by deflecting it with a galvanometer and an acousto-optic deflector in synchronization with the scanning beam rate of a video monitor. A photodetector employing a ring-type collection lens monitors the intensity of light scattered substantially along the wafer surface. This arrangement was employed to take advantage of the finding that a patterned wafer having no particulate matter thereon will scatter substantially no light along the wafer surface, while a wafer having particulate matter on it will scatter a portion of the light impinging thereon along the surface.

Another particle detection apparatus and method is disclosed in U.S. Pat. No. 4,391,524, to Steigmeier et al., wherein a laser beam is scanned at an angle normal to the wafer's surface. In addition to rotating, the wafer stage is provided with movement along one axis that results in the wafer being scanned in a spiral fashion. A single detector is positioned perpendicular to the surface to collect scattered light. Threshold circuitry is employed to discriminate between the defects monitored.

It is an object of the present invention to provide a high-speed apparatus which is capable of scanning a laser beam across the surface of either a patterned or unpatterned wafer to detect anomalies thereon with sizes on the order of a fraction of a micron.

It is a further object of the present invention to classify detected anomalies and determine their size while increasing the confidence and accuracy of the detection system by reducing false counts.

SUMMARY OF INVENTION

These objects have been achieved with an apparatus and method for detecting anomalies of sub-micron size, including pattern defects and particulate contaminants, on both patterned and unpatterned wafer surfaces. For the purposes of this application, a particulate contaminant is defined as foreign material resting on a surface, generally protruding out of the plane of the surface. A pattern defect may be in or below the plane of the surface and is usually induced by contaminants during a photolithographic processing step or caused by crystal defects in the surface.

One aspect of the invention is directed towards an optical scanning system for detection of anomalies, such as particles and pattern defects on a surface, comprising means for directing a focused beam of light onto a sample surface to produce an illuminated spot thereon and means for scanning the spot across the surface along a first scan line. The system further comprises a first detector positioned adjacent to said surface to collect scattered light from the spot wherein the detector includes a one-dimensional or two-dimensional array of sensors and means for focusing scattered light from the illuminated spot at each of a plurality of positions along the scan line onto a corresponding sensor in the array.

Another aspect of the invention is directed towards an optical scanning method for detection of anomalies, such as particles and patterns on a surface, comprising the steps of directing a focused beam of light onto a sample surface to produce an illuminated spot thereon; scanning a spot across the surface along a first scan line; positioning a first detector adjacent to said surface to collect scattered light from the spot, wherein the detector includes a one-dimensional or two-dimensional array of sensors; and focusing scattered light from the illuminated spot at each position along the scan line onto a corresponding sensor in the array.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a simplified perspective plan view of the illumination and collection optics of the present invention.

FIG. 2 is a top view of the illumination and collection optics shown in FIG. 1.

FIG. 3 is a detailed view showing the scan path of a spot on a wafer surface.

FIG. 4 is a detailed view of a collection channel shown in FIG. 1.

FIGS. 5A-5B are plan views showing a polarization scheme employed by the present invention.

FIG. 6 is a graph of an electrical signal amplitude (I) versus beam scan position (X) on a wafer produced by the method of the present invention using the apparatus shown in FIG. 1.

FIGS. 7A-7E are top views of a display derived from a scan of the wafer, as shown in FIG. 3.

FIG. 8 is a plan view of an imaging channel shown in FIG. 1.

FIG. 9A is a schematic view of an elliptical-shaped illuminated area or spot on a surface to be inspected to illustrate the invention.

FIG. 9B is a graphical illustration of the illumination intensity across the width or short axis of the elliptical spot of FIG. 9A for defining a boundary of the spot and to illustrate a point spread function of the illumination beam.

FIG. 9C is a schematic view of three positions of an illuminated spot on a surface to be inspected to illustrate the scanning and data gathering process of the system of this invention.

FIG. 10 shows partially in perspective and partially in block diagram form a system for inspecting anomalies of a semiconductor wafer surface to illustrate the preferred embodiment of the invention.

FIG. 11 is a block diagram of the imaging channel two-dimensional array detector of FIG. 8 and of a processor for controlling the detector and for synchronizing the transfer of signals in the detector with the scanning of light beam in FIGS. 1 and 10.

For simplicity in description, identical components are identified by the same numerals in this application.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention, as shown in FIG. 1, is based on the discovery that the scattering cross section of an anomaly on a patterned surface is asymmetrical. This in part is due to the asymmetry of the anomaly itself, or, in the case of particulate contaminants, the pattern on which a particulate rests changing the effective scattering cross section of the particle. Taking advantage of this discovery, a plurality of detectors is provided that includes groups of collector channels symmetrically disposed about the circumference of the surface. Although a greater number of collector channels may be employed in each group, the preferred embodiment uses two groups of two collector channels, 10 a-b and 11 a-b, disposed symmetrically about the wafer surface 12 so that each collector channel within a pair is located at the same azimuthal angle on opposite sides of the scan line, indicated by the line B. With collector channels positioned symmetrically in the azimuth, a substantial reduction in false counts can be obtained. For example, an anomaly having a symmetrical scattering cross section will cause scattered light to impinge on a pair of collector channels, disposed symmetrically in the azimuth, with the same intensity. Anomalies with an asymmetrical scattering cross section will cause scattered light to impinge on the same pair of collector channels with differing intensities. By comparing data representing the intensity of light impinging on symmetrically disposed collector channels, signals which are in common, such as pattern signals, may be discarded. This provides a high confidence level that the resulting signals are in fact anomalies, and not due to random scattering by surface features. The data from the channels is compared by performing various algorithms and logical operations, e.g., OR, AND and XOR. In addition, examining the data concerning the anomalies having non-identical signals in the two channels allows their shape and/or composition to be determined.

As shown in FIG. 1, a light source 13, typically a laser, emits a beam 14. Beam 14 is directed towards the pre-deflector optics 15, which consists of a half wave-plate, a spatial filter and several cylindrical lenses, in order to produce an elliptical beam with a desired polarization that is compatible with the scanner 16. The pre-deflector optics 15 expands the beam 14 to obtain the appropriate numerical aperture. The post-deflector optics 17 includes several cylindrical lenses and an air slit. Finally, the beam 14 is brought into focus on the wafer surface 12 and scanned along the direction, in the plane of the wafer surface 12, indicated by B, perpendicular to the optical axis of the beam 14. The type of deflector employed in the apparatus is application dependent and may include a polygonal mirror or galvanometer. However, in the preferred embodiment, deflector 16 is an acousto-optic deflector. The wafer surface 12 may be smooth 18 or patterned 19. In addition to the collector channels 10 a-b and 11 a-b, described above, detector channels are provided which include a reflectivity/autoposition channel 20, an imaging channel 21 and an alignment/registration channel 22, each of which is discussed more fully below.

The beam 14 has a wavelength of 488 nm and is produced by an Argon ion laser. The optical axis 48 of the beam 14 is directed onto the wafer surface 12 at an angle, θ. This angle, θ, is in the range of 55-85° with respect to the normal to the wafer surface 12, depending on the application. The scanning means includes the deflector 16 and the translation stage 24 upon which the wafer rests. The position of the wafer on the stage 24 is maintained in any convenient manner, e.g., vacuum suction. The stage 24 moves to partition the surface 12 into striped regions, shown as 25, 26 and 27 with the deflector 16 moving the beam across the width of the striped regions.

Referring to FIG. 2, the grazing angle of the beam 14 produces an elliptical spot 23 on the wafer surface 12, having a major axis perpendicular to the scan line. The deflector 16 scans the spot 23 across a short scan line equal in length to the width of striped region 25 to produce specularly reflected and scattered light. The spot 23 is scanned in the direction indicated, as the stage 24 moves the wafer perpendicular to the scan line. This results in the spot 23 moving within the striped region 25, as shown in FIG. 3. The preferred embodiment scans in only one direction as indicated by scan path 28. Scan path 28 has an effective start location at 29 and the spot 23 moves to the right therefrom until it reaches the border 31 of striped region 25. Upon reaching border 31, the spot 23 moves perpendicular to the scan direction and the spot assumes a new start position 30 and moves parallel to scan line 28, along scan line 32. The deflector 16 continues to scan the spot 23 in this fashion along the entire length of striped region 25. Upon completion of the scan of striped region 25, the stage 24 moves the wafer to permit the scanning of the adjacent striped region 26. The effective start location 33 is positioned so that the stage 24 shall move perpendicular to each scan line in a direction opposite to that when scanning striped region 25, thereby forming a serpentine scan. This is demonstrated by scan paths 34 and 35. Moving the stage 24 to scan adjacent striped regions in opposite directions substantially reduces the amount of mechanical movement of the stage while increasing the number of wafers scanned per hour.
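By way of illustration only, the order in which scan lines are visited in such a serpentine scan can be sketched as follows. The function name `serpentine_scan` and the stripe and line indices are hypothetical and do not appear in this disclosure; the sketch assumes every striped region holds the same number of scan lines and models only the visiting order, not the deflector or stage hardware.

```python
def serpentine_scan(num_stripes, lines_per_stripe):
    """Return (stripe, line) pairs in the order the spot visits them.

    Within a stripe the deflector always sweeps in the same direction;
    only the stage travel (the line index order) reverses on every
    other stripe, which is what makes the overall path serpentine.
    """
    order = []
    for stripe in range(num_stripes):
        lines = range(lines_per_stripe)
        if stripe % 2 == 1:
            # Odd stripes: the stage moves in the opposite direction.
            lines = reversed(lines)
        for line in lines:
            order.append((stripe, line))
    return order


if __name__ == "__main__":
    for stripe, line in serpentine_scan(3, 4):
        print(f"stripe {stripe}, scan line {line}")
```

Because the last line of one stripe is adjacent to the first line of the next, the stage does not need to return to its starting edge between stripes.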

Referring to FIGS. 1 and 3, light scattered from the wafer surface 12 is detected by a plurality of detectors, including collector channels 10 a-b and 11 a-b. An important aspect of the collector channels is that they collect light over a fixed solid angle, dependent upon, inter alia, the elevational and azimuthal angle of the channel. The optical axis of each collection channel is positioned at an angle of elevation Ψ in the range of 70-90°, with respect to the normal to the surface 12. As discussed above, collector channels 10 a and 10 b are symmetrically positioned at the same azimuthal angle with respect to beam 14, on opposite sides of the scan line. Collector channels 10 a and 10 b are positioned, with respect to the beam 14, at an azimuthal angle Φ₁ in the range of about 75° to about 105° to collect laterally scattered light. Laterally scattered light is defined as light scattered at azimuthal angles in the range of about 75° to about 105°, with respect to beam 14. Similar to collector channels 10 a and 10 b, channels 11 a and 11 b are positioned on opposite sides of the scan line at the same azimuthal angle; however, the azimuthal angles Φ₂ of channels 11 a and 11 b are in the range of 30° to 60°, to collect forwardly scattered light. Forwardly scattered light is defined as light scattered at azimuthal angles in the range of 30° to 60°.
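The two azimuthal definitions given above can be expressed, purely as an illustrative sketch, as a small classification helper. The names `LATERAL_RANGE`, `FORWARD_RANGE` and `classify_azimuth` are hypothetical, and the stated "about" limits are treated here as hard, inclusive bounds, which is an assumption of the sketch.

```python
# Azimuthal ranges in degrees, measured with respect to beam 14.
# The inclusive hard limits are an assumption made for this sketch.
LATERAL_RANGE = (75.0, 105.0)
FORWARD_RANGE = (30.0, 60.0)


def classify_azimuth(phi_deg):
    """Label an azimuthal scattering angle as lateral, forward or other."""
    if LATERAL_RANGE[0] <= phi_deg <= LATERAL_RANGE[1]:
        return "lateral"
    if FORWARD_RANGE[0] <= phi_deg <= FORWARD_RANGE[1]:
        return "forward"
    return "other"


if __name__ == "__main__":
    for phi in (45.0, 90.0, 120.0):
        print(phi, classify_azimuth(phi))
```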

Providing the groups of collector channels, at differing azimuthal angles, facilitates classifying detected anomalies, by taking advantage of a discovery that laterally scattered light is more sensitive to detecting pattern defects, and forwardly scattered light is more sensitive to detecting particulate contaminants. To that end, channels 10 a and 10 b are positioned to collect laterally scattered light, representing pattern defects, and channels 11 a and 11 b are provided to collect forwardly scattered light, representing particulate contamination.

Referring to FIG. 4, each collector channel 10 a-b and 11 a-b includes a lens system 113 that collects scattered light. A series of mirrors 114 a-c reflect the light so that it is imaged onto a photomultiplier tube (PMT) 115. The PMT 115 converts the light impinging thereon into an electrical signal having a voltage level that is proportional to the light intensity. Positioned at the Fourier transform plane is a programmable spatial filter 116 and a variable aperture stop 117. The programmable spatial filter 116 allows the system to take advantage of spatial filtering when periodic features on the surface 12 are scanned. In addition to the angle of elevation and the azimuthal angle of each channel, the variable aperture stop permits varying the elevational collection angle by limiting the light introduced into the collector channel, in accordance with the geometry of the features on the wafer surface 12. Also located proximate to the Fourier transform plane is a variable polarization filter 118. It should be noted that it is also possible to place a PMT directly at the Fourier transform plane.

Referring to FIGS. 5A-5B, it was found that by employing the following polarization schemes, the signal to background of the system could be substantially improved. To obtain optimum signal to background, the polarization scheme employed by the system is surface dependent. It may also be used to determine the composition of the anomaly, e.g., as composed of metallic or dielectric material. With respect to pattern defects, the polarizing element included in the post-scanner optics 17 will place the beam 14 in a state of either P or S polarization. A beam is in a state of S polarization when its electric field is perpendicular to the plane of incidence. The plane of incidence is parallel to the plane of the paper. It is defined by the surface 12, beam 14 and reflected beam 14 b. A vector representation of the beam is shown by a $\vec{k}$ vector representing the direction of propagation. The magnetic field is shown as the $\vec{H}$ vector. The electric field vector is shown as being perpendicular to the plane of incidence by representing it with a dot, $\vec{E}$. A beam is in a state of P polarization when the electric field is in the plane of incidence. This is shown in FIG. 5B where the beam 14 is shown in vector form with a propagation vector $\vec{k}$, a magnetic field vector shown as a dot, $\vec{H}$, and the electric field vector $\vec{E}$, perpendicular to the propagation vector $\vec{k}$. Referring also to FIG. 4, if beam 14 is incident on the surface 12 in an S state of polarization, the variable polarization filter 118 would allow scattered light in an S state of polarization to pass through it and attenuate all other scattered light. For example, both non-polarized and P polarized light would be attenuated and S polarized light would be collected by the collector channels. Alternatively, optimizing the detection of pattern defects could be accomplished with an S polarized beam 14 and the polarization filter allowing all scattered light to pass through it. If the beam 14 is in a P polarization state, the variable polarization filter 118 would allow P polarized light to pass through it and would attenuate all other scattered light. Alternatively, the polarization filter could allow all scattered light to be detected when beam 14 is P polarized. This also optimizes detection of pattern defects. Similarly, if the beam 14 were incident on the surface 12 with either a left or right handed circular polarization, the collector channels would be very sensitive to detecting pattern defects by allowing the polarization filter to pass all the collected light therethrough.

To detect particulate contaminants on a patterned surface, the variable polarization filter 118 would attenuate scattered light that is not in a P state of polarization, if the beam were S polarized. Were beam 14 in a P state of polarization, the collector channels would collect scattered light that was S polarized, whereby the variable polarization filter 118 would attenuate all other scattered light impinging on the channel. For detecting particulates on a bare surface, beam 14 would be in a P state of polarization and the collector channels would collect all light scattered therefrom to maximize the capture rate.
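The polarization schemes described above for pattern defects and particulate contaminants can be summarized, as an illustrative sketch only, in a lookup table. The table name `FILTER_OPTIONS`, the key labels and the function `filter_options` are hypothetical; the value "all" denotes a filter setting that passes all scattered light. Combinations not described in the text are deliberately left out rather than guessed.

```python
# (anomaly type, polarization of beam 14) -> filter settings described
# in the text. All labels are illustrative names for this sketch.
FILTER_OPTIONS = {
    ("pattern_defect", "S"): ("S", "all"),
    ("pattern_defect", "P"): ("P", "all"),
    ("pattern_defect", "circular"): ("all",),
    ("particle_on_pattern", "S"): ("P",),
    ("particle_on_pattern", "P"): ("S",),
    ("particle_on_bare", "P"): ("all",),
}


def filter_options(target, beam_polarization):
    """Return the filter pass states described for this combination."""
    try:
        return FILTER_OPTIONS[(target, beam_polarization)]
    except KeyError:
        raise ValueError(
            f"no scheme described for {target!r} with "
            f"{beam_polarization!r} illumination"
        ) from None


if __name__ == "__main__":
    print(filter_options("particle_on_pattern", "S"))
```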

Referring to FIG. 6, an electrical signal 37 is produced by one of the inspection channels corresponding to an intensity I of collected scattered light as a beam scans over a scan path. The abscissa X of the graph in FIG. 6 represents the spatial position of the beam along the scan path. Signal 37 is made of a plurality of discrete samples taken during the scan, e.g., a plurality of scan lines, each of which was scanned at a different position on a surface.

FIGS. 1 and 7A-E show an example of an interchannel communication scheme. Shown therein is a resulting display of a map constructed by a processor 500 from the signals produced by the inspection channels. For purposes of this example, FIGS. 7A and 7B represent scattered light detected by a pair of collector channels. The light detected from the surface consists of a plurality of signals, shown as spots 38. These spots may represent anomalies or false positives: light detected from features or other non-anomalies present on the surface. The spots 38 may be stored digitally in the processor memory at addresses corresponding to spatial positions on the surface. The processor 500 compares the data stored in memory at addresses represented by the map shown in FIG. 7A with the data stored in memory represented by the map shown in FIG. 7B. The data can be compared by performing various algorithmic or logical operations on it. A logical OR operation maximizes the capture rate at the expense of a potential increase in false counts by storing all anomalies detected between both channels in memory. The composite map shown in FIG. 7C is the end result of performing a logical OR operation on the data stored in the processor's memory addresses, as represented by the maps shown in FIGS. 7A and 7B. Alternatively, a logical AND operation would discard all anomalies that are not common to both channels, which is the preferred embodiment. The composite map shown in FIG. 7D is the end result of performing a logical AND operation on the data stored in the processor's memory addresses, as represented by the maps shown in FIGS. 7A and 7B. An exclusive OR operation discards anomalies that are detected on both channels, keeping only those anomalies which are not commonly detected, as shown in FIG. 7E. These “suspect” particles would merit further examination with, inter alia, a high resolution microscope which could be employed on the system.
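As a minimal sketch of the OR, AND and exclusive OR comparisons described above, each map may be modeled as a set of surface positions at which a spot was registered. The set representation, the function name `combine_maps` and the sample coordinates are assumptions made for illustration and are not taken from the processor 500.

```python
def combine_maps(map_a, map_b, operation):
    """Combine two defect maps, each a set of (x, y) positions."""
    if operation == "OR":
        # Keep everything seen by either channel (maximum capture rate).
        return map_a | map_b
    if operation == "AND":
        # Keep only positions common to both channels.
        return map_a & map_b
    if operation == "XOR":
        # Keep only positions seen by exactly one channel ("suspects").
        return map_a ^ map_b
    raise ValueError(f"unknown operation: {operation!r}")


if __name__ == "__main__":
    channel_a = {(1, 1), (2, 3), (4, 4)}  # made-up positions
    channel_b = {(2, 3), (4, 4), (5, 0)}
    for op in ("OR", "AND", "XOR"):
        print(op, sorted(combine_maps(channel_a, channel_b, op)))
```

The set model discards signal amplitude; a fuller treatment would compare intensities at each address, which this sketch does not attempt.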

Referring again to FIG. 6, another manner in which to construct the maps, shown in FIGS. 7A-C, is provided in which only those positions where the signal 37 crosses a certain threshold voltage level are stored in memory, while the remaining signal portions are discarded. For example, two threshold levels are shown: a fixed threshold level 39, and a variable threshold level 40. At threshold level 39, peaks 41-47 are registered and stored in memory. At the variable threshold level 40, as shown, only peaks 41, 43 and 45 are stored in memory. Using the variable threshold level 40 as shown, fewer positions are registered to form a map, thereby making the subsequent processing faster, but at the risk of failing to detect smaller anomalies. The fixed threshold level 39 results in a greater number of positions being detected, but makes the system slower. Typically, the fixed threshold level 39 is preset before scanning a wafer, and the variable threshold level is derived from the reflectivity/autoposition channel as described below.
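The thresholding step can be sketched as follows, for illustration only. The function name `positions_above` and the sample values are hypothetical; the threshold may be a single fixed level or a per-sample sequence standing in for a variable level, and how such a variable level is actually computed is not modeled here.

```python
def positions_above(signal, threshold):
    """Return sample indices where the signal exceeds the threshold.

    `threshold` is either one number (fixed level) or a sequence with
    one level per sample (variable level).
    """
    if isinstance(threshold, (int, float)):
        threshold = [threshold] * len(signal)
    if len(threshold) != len(signal):
        raise ValueError("threshold and signal lengths differ")
    return [i for i, (s, t) in enumerate(zip(signal, threshold)) if s > t]


if __name__ == "__main__":
    signal = [0.1, 0.9, 0.2, 0.6, 0.1]        # made-up samples
    fixed = 0.5
    variable = [0.5, 0.5, 0.5, 0.7, 0.5]      # made-up local levels
    print(positions_above(signal, fixed))
    print(positions_above(signal, variable))
```

Only the registered indices need to be stored, which is why a higher threshold shortens subsequent processing at the cost of missing weaker peaks.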

Although the above-described example discussed comparing maps from signals generated by a pair of collector channels, this is not the only manner in which the system may operate. It is to be understood that maps formed from signals generated by the detector channels may also be compared to identify and classify anomalies, by performing algorithms and logical operations on the data, as described above. Comparing signals to a variable threshold level provides an instructive example, because the threshold level is derived from the bright field reflectivity/autoposition channel 20, shown in FIG. 1.

The variable threshold level is dependent upon the local reflectivity. To that end, the bright field reflectivity/autoposition channel 20 is positioned in front of the beam 14 to collect specularly reflected light. The bright field signal derived from this channel carries information concerning the pattern, local variations in reflectivity and height. This channel is sensitive to detecting various defects on a surface. For example, the bright field signal is sensitive to film thickness variations, discoloration, stains and local changes in dielectric constant. Taking advantage of bright field signal sensitivity, the bright field signal is used to produce the variable threshold level 40, shown in FIG. 6. It is also used to produce an error height signal, corresponding to a variation in wafer height, which is fed to a z-stage to adjust the height accordingly, as well as to normalize the collector and detection channel signals, whereby the signals from the inspection channels each are divided by the bright field signal. This removes the effect of DC signal changes due to surface variations. Finally, the bright field signal can be used to construct a reflectivity map of the surface. This channel is basically an unfolded Type I confocal microscope operating in reflection mode. It is considered unfolded because the illuminating beam and reflected beams, here, are not collinear, whereas, in a typical reflection confocal microscope the illuminating and reflected beams are collinear.
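The normalization mentioned above, in which each inspection channel signal is divided by the bright field signal, can be sketched as a sample-by-sample division. The function name `normalize_by_bright_field` and the small floor `eps` that guards against division by zero are assumptions of this sketch and are not part of the disclosure.

```python
def normalize_by_bright_field(channel, bright_field, eps=1e-12):
    """Divide each channel sample by the matching bright field sample.

    `eps` is a hypothetical floor that avoids dividing by zero where
    the bright field signal vanishes.
    """
    if len(channel) != len(bright_field):
        raise ValueError("signals must have the same number of samples")
    return [c / max(b, eps) for c, b in zip(channel, bright_field)]


if __name__ == "__main__":
    channel = [2.0, 3.0, 1.0]        # made-up scattered-light samples
    bright = [2.0, 1.5, 0.5]         # made-up bright field samples
    print(normalize_by_bright_field(channel, bright))
```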

Referring to FIG. 8, the imaging channel is shown to include a lens assembly 119 that images scattered light onto a one-dimensional or two-dimensional array of sensors 120 having pixels, e.g., charge-coupled detectors. The array 120 is positioned so that the pixels collect light scattered by the illuminated spot around directions (e.g. direction 122) normal to the wafer surface 12 with the lens assembly 119 collecting upwardly scattered light. The spot 23 is focused and scanned in synchronism with the transferring of a charge contained in each pixel. This enables charging each pixel 121 independently of the remaining pixels, thereby activating one pixel 122 at a time with each pixel positioned so as to receive light scattered from a unique area of the sample surface, illuminated by the spot along the scan line. In this manner each pixel forms an image on the area illuminated by the spot, wherein there is a one-to-one correlation between a pixel and the spot position along the scan line. This increases the sensitivity of the system by improving the signal to background ratio. For example, it can be shown that for a PMT-based channel, the signal to background is defined as follows:

$$P_s / P_b = \sigma / (A_b h)$$

where $P_s$ is the optical power scattered by a particle, $P_b$ is the background optical power, $A_b$ is the area of the beam on the surface and $\sigma$ and $h$ are constants. This shows that the ratio of the scattering cross section to the area of the beam determines the signal to background ratio.

With an imaging-based channel, all the scattered power from an anomaly is imaged onto one array element. The power distributed in background, however, is imaged over a range of elements, depending upon the magnification of the system. Assuming a linear magnification $M$, at the image plane the background power is spread over an area as follows:

$$M^2 A_b$$

providing an effective background power per array element as

$$P_b = P_i h A_c / (M^2 A_b)$$

where $A_c$ is the area of an array element. Therefore, the signal to background ratio is given by the following:

$$P_s / P_b = M^2 \sigma / (A_c h)$$

This shows that the signal to background ratio is independent of the spot diameter, providing an improvement in signal to background ratio over the PMT-based channel given by:

$$i = M^2 A_b / A_c$$
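The comparison between the two channel types can be checked numerically. The sketch below simply evaluates the three expressions given above; the function names and all numeric values are illustrative assumptions and do not come from the specification.

```python
import math


def pmt_signal_to_background(sigma, beam_area, h):
    """P_s / P_b for a PMT-based channel: sigma / (A_b * h)."""
    return sigma / (beam_area * h)


def imaging_signal_to_background(sigma, element_area, h, magnification):
    """P_s / P_b for an imaging channel: M^2 * sigma / (A_c * h)."""
    return magnification ** 2 * sigma / (element_area * h)


def improvement_factor(beam_area, element_area, magnification):
    """Ratio of the imaging result to the PMT result: M^2 * A_b / A_c."""
    return magnification ** 2 * beam_area / element_area


if __name__ == "__main__":
    # Hypothetical values, areas in square microns.
    sigma, h = 1.0e-2, 1.0e-6
    beam_area, element_area, magnification = 100.0, 169.0, 5.0

    pmt = pmt_signal_to_background(sigma, beam_area, h)
    img = imaging_signal_to_background(sigma, element_area, h, magnification)
    gain = improvement_factor(beam_area, element_area, magnification)
    print(pmt, img, gain)
    assert math.isclose(img / pmt, gain)
```

Note that the beam area cancels out of the imaging expression, which is the sense in which the imaging result is independent of the spot diameter.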

If imaging is not desired, another PMT-based collector channel similar to the one shown in FIG. 4 may be employed in lieu of the imaging channel, to collect upwardly scattered light.

FIG. 9A is a schematic view of an elliptical-shaped illuminated area (or spot) of a surface inspected by the system of this invention to illustrate the invention. As explained, the laser beam illuminating the surface inspected approaches the surface at a grazing angle, so that even though the illumination beam has a generally circular cross-section, the area illuminated is elliptical in shape such as area 210 in FIG. 9A. As known to those skilled in the art, in light beams such as laser beams, the intensity of the light typically does not have a flat distribution and does not fall off abruptly to zero across the boundary of the spot illuminated, such as at boundary 210 a of spot 210 of FIG. 9A. Instead, the intensity falls off at the outer edge of the illuminated spot at a certain inclined slope, so that instead of sharp boundaries such as boundary 210 a illustrated in FIG. 9A, the boundary is typically blurred and forms a band of decreasing intensity at increasing distance away from the center of the illuminated area.
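The elongation of a circular beam into an elliptical spot follows from simple projection geometry. The sketch below is an idealized illustration only: it assumes a collimated beam of the given diameter, a grazing angle measured from the surface plane, and it ignores focusing and the blurred boundary discussed above. The function name and values are hypothetical.

```python
import math


def illuminated_ellipse(beam_diameter, grazing_angle_deg):
    """Return (major_axis, minor_axis) of the illuminated area.

    Assumes the grazing angle is measured from the surface plane, so that
    90 degrees corresponds to normal incidence and a circular spot.
    """
    if not 0.0 < grazing_angle_deg <= 90.0:
        raise ValueError("grazing angle must be in (0, 90] degrees")
    minor = beam_diameter
    major = beam_diameter / math.sin(math.radians(grazing_angle_deg))
    return major, minor


if __name__ == "__main__":
    # A 10 micron beam at a 30 degree grazing angle (illustrative values).
    print(illuminated_ellipse(10.0, 30.0))
```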

In many lasers, the laser beam produced has a Gaussian intensity distribution, such as that shown in FIG. 9B. FIG. 9B is a graphical illustration of the spatial distribution of the illumination intensity in the Y direction of a laser beam that is used in the preferred embodiment to illuminate spot 210 of a surface to be inspected as shown in FIG. 9A, and thus is also the illumination intensity distribution across spot 210 in the Y direction. As shown in FIG. 9B, the illumination intensity has been normalized so that the peak intensity is 1, and the illumination intensity has a Gaussian distribution in the X direction as well as in the Y direction. Points 212 and 214 are at spatial locations y1 and y5, at which points the illumination intensity drops to 1/e² of the peak intensity, where e is the base of the natural logarithm. The spot 210 is defined by the area within a boundary 210 a where the illumination is 1/e² of that of the maximum intensity of illumination at the center of the spot. The lateral extent of the spot 210 may then be defined to be the boundary 210 a. The size of the spot is then defined by means of the boundary. Obviously, other definitions of the boundary of a spot and of spot size are possible, and the invention herein is not restricted to the above definition.

To maintain uniform detection sensitivity, the scanning light beam is preferably caused to scan short sweeps having a spatial span less than the dimension of the surface it is scanning, as illustrated in the preferred embodiment in FIG. 10, where these short sweeps are not connected together but are located so that they form arrays of sweeps. Preferably, the lengths of the sweeps are in the range of 2-25 mm.

The surface inspection system of this application will now be described with reference to FIGS. 1 and 10. As shown in FIG. 10, system 200 includes a laser 222 providing a laser beam 224. Beam 224 is expanded by beam expander 226 and the expanded beam 228 is deflected by acousto-optic deflector (AOD) 230 into a deflected beam 232. The deflected beam 232 is passed through post-AOD and polarization selection optics 234 and the resulting beam is focused by telecentric scan lens 236 onto a spot 210 on surface 240 to be inspected, such as that of a semiconductor wafer, photomask or ceramic tile, patterned or unpatterned.

In order to move the illuminated area that is focused onto surface 240 for scanning the entire surface, the AOD 230 causes the deflected beam 232 to change in direction, thereby causing the illuminated spot 210 on surface 240 to be scanned along a sweep 250. As shown in FIG. 10, sweep 250 is preferably a straight line having a length which is smaller than the dimension of surface 240 along the same direction as the sweep. Even where sweep 250 is curved, its span is less than the dimension of surface 240 along the same general direction. After the illuminated spot has traversed along sweep 250, surface 240 of the wafer is moved by XY stage 24 (FIG. 1) parallel to the X axis in FIG. 10 so that the illuminated area of the surface moves along arrow 252 and AOD 230 causes the illuminated spot to scan a sweep 250′ parallel to sweep 250 and in an adjacent position spaced apart from sweep 250 along the X axis, to scan an adjacent sweep at a different X position. As described below, this small distance is preferably equal to about one quarter of the dimension of spot 210 in the X direction. This process is repeated until the illuminated spot has covered strip 254; at this point in time the illuminated area is at or close to the edge 254 a. At such point, the surface 240 is moved by XY stage 24 along the Y direction by about the length of sweep 250 in order to scan and cover an adjacent strip 256, beginning at a position at or close to edge 256 a. The surface in strip 256 is then covered by short sweeps such as 250 in a similar manner until the other end or edge 256 b of strip 256 is reached, at which point surface 240 is again moved along the Y direction for scanning strip 258. This process is repeated prior to the scanning of strips 254, 256, 258 and continues after the scanning of such strips until preferably the entire surface 240 is scanned. Surface 240 is therefore scanned by scanning a plurality of arrays of sweeps, the totality of which substantially covers the entire surface 240.
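The strip-by-strip coverage described above can be sketched as a simple planner that lists the starting position of every sweep. This is an illustration under stated assumptions: the surface is treated as a rectangle, the X step is exactly one quarter of the spot's X dimension, and each strip starts at the edge where the previous strip ended (a serpentine order inferred from the description of edges 254 a and 256 a). The function name and the numbers are hypothetical.

```python
import math


def plan_sweeps(width_x, height_y, sweep_length, spot_x):
    """Return a list of (x, y) start positions, one per sweep.

    Sweeps run along Y for `sweep_length`; successive sweeps in a strip are
    offset along X by a quarter of the spot's X dimension; successive strips
    are offset along Y by the sweep length.
    """
    step_x = spot_x / 4.0
    sweeps_per_strip = math.ceil(width_x / step_x)
    strips = math.ceil(height_y / sweep_length)

    plan = []
    for strip in range(strips):
        xs = [i * step_x for i in range(sweeps_per_strip)]
        if strip % 2 == 1:
            xs.reverse()  # assumed serpentine order: start where the last strip ended
        y = strip * sweep_length
        plan.extend((x, y) for x in xs)
    return plan


if __name__ == "__main__":
    # Illustrative units: a 100 x 10 area, sweeps of length 5, spot X size 20.
    plan = plan_sweeps(100.0, 10.0, 5.0, 20.0)
    print(len(plan), plan[0], plan[19], plan[20], plan[-1])
```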

The deflection of beam 232 by AOD 230 is controlled by chirp generator 280 which generates a chirp signal. The chirp signal is amplified by amplifier 282 and applied to the transducer portion of AOD 230 for generating sound waves to cause deflection of beam 232 in a manner known to those skilled in the art. For a detailed description of the operation of the AOD, see “Acoustooptic Scanners and Modulators,” by Milton Gottlieb in Optical Scanning, ed. by Gerald F. Marshall, Dekker 1991, pp. 615-685. Briefly, the sound waves generated by the transducer portion of AOD 230 modulate the optical refractive index of an acoustooptic crystal in a periodic fashion, thereby leading to deflection of beam 232. Chirp generator 280 generates appropriate signals so that after being focused by lens 236, the deflection of beam 232 causes the focused beam to scan along a sweep such as sweep 250 in the manner described.

Chirp generator 280 is controlled by timing electronics circuit 284 which in the preferred embodiment includes a microprocessor. The microprocessor supplies the beginning and end frequencies f1, f2 to the chirp generator 280 for generating appropriate chirp signals to cause the deflection of beam 232 within a predetermined range of deflection angles determined by the frequencies f1, f2. The illumination sensor optics 20 and adaptive illumination control 292 are used to detect and control the level of illumination of spot 210. The optics 20 and adaptive illumination control 292 are explained in detail in U.S. Pat. No. 5,530,550.
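To illustrate how the frequencies f1 and f2 bound the range of deflection angles, the sketch below uses the common small-angle approximation for an acousto-optic deflector, in which the deflection angle is roughly the optical wavelength times the acoustic frequency divided by the acoustic velocity. That relation, the idealized scan lens (spot displacement equal to focal length times angle), the function names, and every numeric value are assumptions made for the example; none of them is taken from the specification.

```python
import math


def deflection_angle(wavelength, frequency, acoustic_velocity):
    """Small-angle AOD deflection in radians (assumed approximation)."""
    return wavelength * frequency / acoustic_velocity


def sweep_length(wavelength, f1, f2, acoustic_velocity, focal_length):
    """Sweep length on the surface for a chirp from f1 to f2.

    Assumes an ideal scan lens that maps angle to position as
    focal_length * angle.
    """
    scan_range = abs(
        deflection_angle(wavelength, f2, acoustic_velocity)
        - deflection_angle(wavelength, f1, acoustic_velocity)
    )
    return focal_length * scan_range


if __name__ == "__main__":
    # Hypothetical values in SI units: 488 nm light, 50-100 MHz chirp,
    # 4200 m/s acoustic velocity, 0.1 m focal length.
    print(sweep_length(488e-9, 50e6, 100e6, 4200.0, 0.1))
```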

Detectors such as detectors 10 a, 10 b, 11 a, 11 b of FIGS. 1 and 10 collect light scattered by anomalies as well as the surface and other structures thereon along sweeps such as sweep 250 and provide output signals to processor 500 in order to detect anomalies and analyze their characteristics.

FIG. 9C is a schematic view of three positions of the illuminated area or spot on a surface to be inspected to illustrate the scanning and data gathering process of system 200. As shown in FIG. 9C, at one instant in time, beam 238 illuminates an area 210 on surface 240. Spot 210 is divided into sixteen areas by grid lines x1-x5, y1-y5, where such areas are referred to below as pixels. In this context, the term “pixel” is defined by reference to the taking of data samples across the intensity distributions along the X and Y axes, such as that in FIG. 9C, and by reference to subsequent data processing. The pixel that is bounded by grid lines x2, x3 and y2, y3 is pixel P, shown as a shaded area in FIG. 9C. If there is an anomaly in this pixel P, and if the light illuminating pixel P has the intensity distribution as shown in FIG. 9B with a high intensity level between grid lines y2 and y3, light scattered by the anomaly will also have a high intensity. However, as the beam moves along the Y axis so that the area 210′ is illuminated instead, pixel P will still be illuminated but at the lower intensity level of that between grid lines y1 and y2; in reference to FIG. 9B, the intensity of the illumination is that between grid lines y1 and y2 in FIG. 9B. Therefore, if the sampling rate employed by the data processor 500 in FIGS. 1 and 10 for processing light detected by the collection or collector channels 10 a, 10 b, 11 a, 11 b is such that a data sample is taken when the illuminating beam is in position 210 and when the illuminating beam is in position 210′, then two data samples will be recorded. Thus, for any pixel such as P, a number of data points will be taken, one when the illumination is at a higher level as illustrated by data point D2 in FIG. 9B and another one when the illumination is at a lower level, illustrated at data point D1 in FIG. 9B. If position 210 is not the starting position of the sweep 250, then two prior samples would have been taken prior to the time when the illuminating beam illuminates the surface 240 in position 210, so that the processor would have obtained two more data samples at points D3, D4 corresponding to the prior positions of the illuminating beam when light of intensity values between grid lines y3, y4 and between y4, y5, respectively, illuminates such pixel P (grid lines y1 through y5 would, of course, move with the location of the spot). In other words, four separate data samples at points D1-D4 would have been taken of the light scattered by an anomaly present in pixel P as the illumination beam illuminates pixel P when scanning along the Y direction.
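The four data samples D1-D4 can be illustrated by evaluating a normalized Gaussian profile at four positions across the spot. The sketch below assumes that grid lines y1 through y5 are evenly spaced between the two 1/e² points and that each sample is taken at the center of its band; both are assumptions for illustration, as are the function names.

```python
import math


def gaussian_intensity(offset, radius):
    """Normalized intensity at `offset` from the spot center.

    `radius` is the distance at which the intensity falls to 1/e^2 of peak.
    """
    return math.exp(-2.0 * (offset / radius) ** 2)


def pixel_samples(radius, bands=4):
    """Illumination levels seen by one pixel as the spot steps across it.

    The spot extent [-radius, +radius] is split into `bands` equal bands
    (assumed even spacing of the grid lines); the returned list holds the
    intensity at each band center, ordered D1, D2, D3, D4 for four bands.
    """
    width = 2.0 * radius / bands
    centers = [-radius + (k + 0.5) * width for k in range(bands)]
    return [gaussian_intensity(c, radius) for c in centers]


if __name__ == "__main__":
    d1, d2, d3, d4 = pixel_samples(1.0)
    print(d1, d2, d3, d4)
    print("edge level:", gaussian_intensity(1.0, 1.0))
```

Under these assumptions the two inner samples (D2, D3) are taken at a higher illumination level than the two outer samples (D1, D4), consistent with the description of D1 and D2 above.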

In most laser beams, the beam intensity has a Gaussian intensity distribution not only in the Y direction but also in the X direction. For this reason, after the illuminating beam completes the scanning operation for scanning a sweep such as sweep 250 as shown in FIG. 10, and when the illuminating beam returns to position 274 for scanning the adjacent sweep 250′ as shown in FIG. 10, it is desirable for the illuminated area along sweep 250′ to overlap that of sweep 250 so that multiple samples or data points can be taken along the X direction as well as along the Y direction. Therefore, when the illumination beam is scanning along sweep 250′ from starting position 274 as shown in FIG. 10, the area illuminated would overlap spot 210; this overlapping spot is 210″ as shown in FIG. 9C, where the spot 210″ is displaced along the X direction relative to spot 210 by one quarter of the long axis of the ellipses 210 and 210″.

Detector 120 includes a one-dimensional or two-dimensional array of sensors. To enable time delayed integration as described below, detector 120 employs a two-dimensional array of sensors as illustrated in FIG. 11.

In reference to FIGS. 8 and 9C, in the preferred embodiment, the lens assembly 119 focuses the light scattered from only a portion of the illuminated spot 210 to a corresponding sensor in a two-dimensional array of sensors 120. By focusing the light scattered by only a portion of the illuminated spot onto a sensor, the sensitivity of the detection system of FIG. 8 is enhanced as compared to a system where light scattered by the entire spot is focused to a sensor. In the preferred embodiment, the lens assembly 119 focuses the light scattered from a pixel, such as pixel P, towards a corresponding sensor in the array 120. As noted above, each pixel on the surface inspected will be illuminated four times in four adjacent and consecutive scans along the Y axis. Thus, in regard to pixel P, it is illuminated during the scan prior to the sweep 250, during sweep 250, during sweep 250′ and during the sweep subsequent to sweep 250′. Furthermore, the focusing of light from only a portion of the spot to a sensor also enables time delay integration to be carried out to enhance the signal-to-noise ratio in a manner described below.

For the purpose of illustration, it is assumed that when spot 210 is scanned along the sweep immediately prior to sweep 250, light scattered from the pixel P is focused by lens assembly 119 onto sensor 121(1)(3) of detector 120 in FIG. 11. Light scattered by pixels adjacent to and having the same Y coordinates as P will be focused by assembly 119 to other sensors in the linear array or line 121(1) of sensors in FIG. 11. In order for the spot 210 to be then subsequently scanned along sweep 250, XY stage 24 moves the wafer surface by a distance substantially equal to ¼ of the length of the long axis of spot 210, so that the lens assembly 119 will now focus the light scattered by pixel P onto sensor 121(2)(3) instead of 121(1)(3), and light scattered by P's adjacent pixels with the same Y coordinates onto sensors along line 121(2). As shown in FIG. 11, sensor 121(1)(3) is electrically connected to sensor 121(2)(3) (e.g. by a wire); in the same vein, the remaining sensors in line 121(1) are similarly electrically connected to corresponding sensors in line 121(2) as shown in FIG. 11. Similarly, the sensors in line 121(2) are electrically connected to corresponding sensors in line 121(3), and so on for all adjacent pairs of lines of sensors in detector 120.

To enable time delayed integration, processor 500 causes the signal in sensor 121(1)(3) obtained by detecting light scattered by pixel P during the previously described scan to be transferred to sensor 121(2)(3), so that the signal obtained during the prior scan will be added to that obtained by sensor 121(2)(3) from detecting the light scattered by pixel P during sweep 250. Similarly, processor 500 causes the thus accumulated signal in sensor 121(2)(3) to be transferred to sensor 121(3)(3) prior to the sweep 250′, so that the signal thus accumulated can be added to that obtained by sensor 121(3)(3) by detecting the light scattered from pixel P during sweep 250′. In this manner, time delayed integration is performed by accumulating the signals obtained from light scattering from pixel P during four sequential sweeps, and the accumulated signal is read out as the output signal for such pixel. The same can be done for other pixels on the surface of the wafer. While in the preferred embodiment the amount of overlap and the sampling rate are controlled so that each illuminating spot is divided into 16 pixels, it will be understood that the spot may be divided into a smaller or greater number of pixels by altering the amount of overlap between sequential sweeps and by altering the sampling rate; such and other variations are within the scope of the invention.

The above-described process may be performed for all of the pixels in the illuminated spot, where processor 500 simply causes all of the signals in each linear array or line of sensors, such as line 121(1), to be transferred to corresponding sensors in the next line 121(2), and this process is carried out for all of the lines, from line 121(1) to the next to the last line 121(N-1), so that time delay integration is performed for all of the pixels. The number of sensors in each line is preferably large enough to cover all the pixels in each sweep. In order to avoid edge effects, it may be desirable to include enough lines of sensors to cover all of the possible positions of the pixels in the illuminated spot along the X direction of the wafer.
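The line-to-line transfer and accumulation described above can be simulated with a small model. This sketch is illustrative only: it assumes that a surface row of pixels advances by exactly one sensor line per sweep, that each line sees a fixed illumination weight (for example, samples of the Gaussian profile along X), and that the last line is read out once a row has passed all lines. The function name, the data layout, and the weights are hypothetical.

```python
def tdi_readout(scatter, weights):
    """Simulate time delayed integration over len(weights) sensor lines.

    scatter[r][c] : scattering strength of surface pixel (row r, column c)
    weights[j]    : illumination level seen by sensor line j during a sweep
    Returns one accumulated row of signals per surface row.
    """
    n_lines = len(weights)
    n_rows = len(scatter)
    n_cols = len(scatter[0])
    lines = [[0.0] * n_cols for _ in range(n_lines)]
    output = []

    for sweep in range(n_rows + n_lines - 1):
        # Transfer: every line hands its accumulated charge to the next line
        # before the sweep begins; the first line starts empty.
        lines = [[0.0] * n_cols] + lines[:-1]

        # Exposure: line j images the surface row that entered j sweeps ago.
        for j in range(n_lines):
            row = sweep - j
            if 0 <= row < n_rows:
                for c in range(n_cols):
                    lines[j][c] += scatter[row][c] * weights[j]

        # Read out the last line once a row has been seen by every line.
        finished = sweep - (n_lines - 1)
        if 0 <= finished < n_rows:
            output.append(list(lines[-1]))
    return output


if __name__ == "__main__":
    surface = [[0.0, 0.0, 5.0, 0.0],   # one anomaly in the third column
               [1.0, 0.0, 0.0, 0.0]]
    profile = [0.3, 0.9, 0.9, 0.3]     # illustrative illumination weights
    for row in tdi_readout(surface, profile):
        print(row)
```

In this model each output value equals the pixel's scattering strength multiplied by the sum of the four weights, which is the accumulation over four sequential sweeps that the text describes.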

As described above, signals obtained by light scattering from the pixel are accumulated over four sequential sweeps. The final accumulated signal is then read out by processor 500 as the output of detector 120 for such pixel. Processor 500 then constructs a defect map from a two-dimensional array of such accumulated signals from the outputs of detector 120. Such map may be compared to the defect maps obtained by processor 500 from detectors 10 a, 10 b, 11 a and 11 b to obtain an AND, a union and an XOR map for the purpose of identifying anomalies. Thus, an AND map would comprise only anomalies present in a map from one detector and in a map or maps from one or more of the remaining detectors. A union map comprises anomalies present in at least one of the maps of two or more detectors. An XOR map comprises anomalies present in the map of only one detector but not in the map or maps of the remaining detectors. As noted above, these maps are useful for classifying defects. Thus, if an anomaly is present in the map of one detector but not in the map of the other symmetrically placed detector, then the anomaly is probably not symmetrical. Or if an anomaly is present in the maps of detectors 10 a, 10 b for detecting laterally scattered light but not in the maps of detectors 11 a, 11 b for detecting forward scattered light, then the anomaly may be more likely to be a pattern defect than a particle.
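For two defect maps, the AND, union and XOR comparisons correspond directly to set operations. The sketch below represents each map as a collection of (x, y) anomaly coordinates and assumes that the same anomaly is reported at exactly the same coordinates in both maps; a practical system would presumably need a position tolerance. The function name and the coordinates are hypothetical.

```python
def compare_maps(map_a, map_b):
    """Return the AND, union and XOR maps of two defect maps.

    Each map is an iterable of (x, y) anomaly coordinates; exact coordinate
    matching is an assumption made for this illustration.
    """
    a, b = set(map_a), set(map_b)
    return {
        "and": a & b,     # anomalies seen by both detectors
        "union": a | b,   # anomalies seen by at least one detector
        "xor": a ^ b,     # anomalies seen by exactly one detector
    }


if __name__ == "__main__":
    lateral = [(10, 4), (22, 7), (31, 9)]   # e.g. from a lateral detector
    forward = [(10, 4), (40, 2)]            # e.g. from a forward detector
    result = compare_maps(lateral, forward)
    for name in ("and", "union", "xor"):
        print(name, sorted(result[name]))
```

An anomaly that appears only in the XOR map of two symmetrically placed detectors is a candidate for the "probably not symmetrical" case mentioned above.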

The XY stage 24 is controlled by a controller (not shown) in communication with processor 500. As this controller causes the stage 24 to move the wafer by a quarter of the X dimension of the spot 210, this is communicated to processor 500, which sends control signals to detector 120 to cause a transfer of signals between adjacent lines of sensors and sends control signals to timing electronics 284 as shown in FIG. 10. Electronics 284 in turn controls the chirp rate of chirp generator 280 so that the transfer of signals between adjacent lines of sensors in detector 120 will have occurred prior to the scanning of the illuminated spot.

In the preferred embodiment, the illuminated spot has a spot size whose minimum dimension is in the range of about 2 to 25 microns.

Referring again to FIG. 1, an alignment and registration channel 22 is provided. Instead of the design in FIGS. 8 and 11, the channel 21 may also have the same design as a basic collection channel 10 a, 10 b and 11 a, 11 b, but it is positioned in the plane of incidence so that the signal produced from the patterns or features on the wafer's surface is at a maximum. The signal obtained is used to properly align the wafer surface 12 so that the streets on the features are not oblique with respect to the scan line. This also reduces the amount of signal collected by the collector channels that results from scattering by patterns.

In operation, the beam 14 is scanned over the surface 12, producing both scattered and specularly reflected light, which are simultaneously detected. The light scattered laterally, forwardly and upwardly is simultaneously detected by the collector channels and the imaging system. The specularly reflected light from the wafer's surface 12 is detected by the bright field reflectivity/autoposition channel 20. Light detected by the inspection channels is converted into electrical signals which are further processed by dedicated electronics, including a processor 500. The processor 500 constructs maps from the signals produced by the inspection channels. When a plurality of identical dies are present on the wafer surface 12, a detection method may be employed whereby periodic feature comparisons are made between adjacent die. The processor compares the maps from the inspection channels either in the analog domain or digitally, by performing logical operations on the data, e.g., AND, OR and XOR, in the manner described above, to detect anomalies. The processor forms composite maps, each representing the anomalies detected by a single group of symmetrically disposed collector channels. The composite maps are then compared so that the processor may classify the anomalies as either a pattern defect or particulate contamination. Typically, the wafer surface 12 has been aligned so that the streets on the die are not oblique with respect to the scan line, using the information carried by the electrical signal produced by the alignment/registration channel. Proper alignment is a critical feature of this invention, because periodic feature comparison is performed to locate anomalies.
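The periodic feature comparison between adjacent die can be illustrated on a one-dimensional signal. The sketch below assumes a perfectly aligned scan in which identical dies repeat every `die_pitch` samples, and flags sample pairs one pitch apart whose signals differ by more than a threshold. The function name, the sample data and the fixed threshold are hypothetical simplifications.

```python
def die_to_die_mismatches(signal, die_pitch, threshold):
    """Return index pairs (i, i + die_pitch) whose signals disagree.

    Assumes the wafer is aligned so that identical features repeat exactly
    every `die_pitch` samples along the scan.
    """
    if die_pitch <= 0:
        raise ValueError("die_pitch must be positive")
    pairs = []
    for i in range(len(signal) - die_pitch):
        if abs(signal[i] - signal[i + die_pitch]) > threshold:
            pairs.append((i, i + die_pitch))
    return pairs


if __name__ == "__main__":
    # Three dies of three samples each; the middle die has an anomaly.
    samples = [1, 2, 3, 1, 2, 9, 1, 2, 3]
    print(die_to_die_mismatches(samples, die_pitch=3, threshold=1))
```

In this example the index that appears in both reported pairs marks the sample that disagrees with both of its neighboring dies, which is why misalignment between the scan line and the streets would undermine the comparison.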

While the above described apparatus and method for detecting anomalies have been described with reference to a wafer surface, it can easily be seen that anomaly detection is also possible for photomasks and other surfaces, as well as producing reflectivity maps of these surfaces. The invention is capable of detecting anomalies of submicron size and affords the added advantage of classifying the type of anomaly and identifying its size and position on the surface. This information is highly useful to wafer manufacturers as it will permit locating the step in the wafer manufacturing process at which point an anomaly occurs.

1. An optical scanning system for detection of anomalies on a surface comprising: optics directing a focused beam of radiation onto a sample surface to produce an illuminated spot thereon; one or more sensor(s); one or more optical element(s), each element collecting radiation scattered from the illuminated spot on the surface along a channel and directing the collected and scattered radiation to one of the sensor(s), causing each of the sensor(s) to provide output signals in response thereto, each of the sensor(s) sensing radiation scattered from the surface in directions away from a specular reflection direction and different from those of radiation sensed by the other sensor(s); a bright field detector detecting a specular reflection of the radiation in the beam from the illuminated spot on the surface to provide an output signal; and a device causing relative motion between the beam and the surface so that the beam is caused to illuminate different parts of the surface and so that the sensor(s) and the detector provide output signals in response to radiation from different parts of the surface illuminated by the beam.